Automatic generation of subword units for speech recognition systems

Authors

Rita Singh

Bhiksha Raj

Richard M. Stern

Abstract

Large vocabulary continuous speech recognition (LVCSR) systems traditionally represent words in terms of smaller subword units. Both during training and during recognition, they require a mapping table, called the dictionary, which maps words into sequences of these subword units. The performance of the LVCSR system depends critically on the definition of the subword units and the accuracy of the dictionary. In current LVCSR systems, both these components are manually designed. While manually designed subword units generalize well, they may not be the optimal units of classification for the specific task or environment for which an LVCSR system is trained. Moreover, when human expertise is not available, it may not be possible to design good subword units manually. There is clearly a need for data-driven design of these LVCSR components. In this paper, we present a complete probabilistic formulation for the automatic design of subword units and dictionary, given only the acoustic data and their transcriptions. The proposed framework permits easy incorporation of external sources of information, such as the spellings of words in terms of a nonideographic script.


Similar resources

Automatic clustering and generation of contextual questions for tied states in hidden Markov models

Most current automatic speech recognition systems based on HMMs cluster or tie together subsets of the subword units with which speech is represented. This tying improves recognition accuracy when systems are trained with limited data, and is performed by classifying the sub-phonetic units using a series of binary tests based on speech production, called “linguistic questions”. This paper descr...


Towards weakly supervised acoustic subword unit discovery and lexicon development using hidden Markov models

State-of-the-art automatic speech recognition and text-to-speech systems are based on subword units, typically phonemes. This necessitates a lexicon that maps each word to a sequence of subword units. Development of a phonetic lexicon for a language requires linguistic knowledge as well as human effort, which may not be always readily available, particularly for under-resourced languages. In su...


Incorporating linguistic knowledge and automatic baseform generation in acoustic subword unit based speech recognition

A major challenge in speech recognition based on acoustic subword units is creating a lexicon which is robust to inter- and intra-speaker variations. In this paper we present two different approaches for incorporating simple word-level linguistic knowledge into the labelling step of the training procedure. The proposed systems also utilise a scheme for combined optimisation of baseforms and subwor...


Constrained Subword Units for Speaker Recognition

Phonetic features have been proposed to overcome performance degradation in spectral speaker recognition in difficult acoustic conditions. The harmful effect of those conditions, however, is not restricted to spectral systems but also affects the performance of the open-loop phone recognisers on which phonetic systems are based. In automatic speech recognition, larger subword units and the use ...


Articulatory feature based continuous speech recognition using probabilistic lexical modeling

Phonological studies suggest that the typical subword units such as phones or phonemes used in automatic speech recognition systems can be decomposed into a set of features based on the articulators used to produce the sound. Most of the current approaches to integrate articulatory feature (AF) representations into an automatic speech recognition (ASR) system are based on deterministic knowledg...



Journal:

IEEE Trans. Speech and Audio Processing

Volume 10

Publication date: 2002
